Summary. - The Middle East respiratory syndrome coronavirus (MERS-CoV) is a recently emerged betacoronavirus
with a high fatality rate. Recently, dipeptidyl peptidase 4 (DPP4, CD26) was identified as the host cell receptor for
MERS-CoV. Interestingly, despite the common presence of DPP4 receptors, the binding to and infection of various
cells show marked variability. In this report, we provide a tool for predicting the host tropism of the
virus based on the host receptor binding interface. We found that the binding of MERS-CoV to cells involves the
amino acid residues K267, Q286, T288, R317, R336, Q344, A291, L294, and I295 in lancets 4 and 5 of the DPP4
receptor. Changes in these residues correspond to a profound decrease in virus binding to
cells. The nine residues at the interface between the virus spike and lancets 4 and 5 of host DPP4 can be
used as a predictive tool for host tropism and virus affinity to host cell receptors.
Introduction


Coronaviruses (CoVs) are enveloped single-stranded
RNA viruses that infect humans and a wide variety of animals,
causing severe respiratory or enteric symptoms (Chang et
al., 2012; Perlman and Netland, 2009). CoVs that infect
humans belong to the alpha- and betacoronaviruses. Severe
acute respiratory syndrome (SARS) has been associated with
the Betacoronavirus genus.

CoVs are characterized by high recombination frequencies.
The large size of the viral genome, the unique mode of viral replication,
the low fidelity of coronavirus-encoded polymerases and
high recombination account for unexpected viral evolution,
including infection of new hosts, changes in clinical signs and
resistance to therapy or vaccination. The human CoV
OC43 has evolved from bovine CoV. Furthermore, porcine
respiratory CoV has evolved from a gastrointestinal ancestor
(Chang et al., 2012; Laude et al., 1993).

MERS-CoV was initially identified in the Arabian Peninsula
in 2012 and was assigned to the Betacoronavirus
genus. The CoV genome encodes 4 major structural
proteins: nucleocapsid (N), spike (S), membrane (M)
and small envelope (E) proteins. The S protein is
a glycoprotein essential for viral attachment to cell surface
receptors. The S protein is cleaved in host cells into S1 and S2
subunits. The S1 subunit binds the host receptor, while the S2
subunit mediates membrane fusion (Wang et al., 2013).

The virus replicates in different hosts using DPP4 as
a functional receptor (Ohnuma et al., 2013). DPP4 has been shown
to be the only essential receptor for binding of the MERS-CoV spike
to host cells. DPP4 therefore constitutes a unique binding
site for MERS-CoV, distinct from the receptor
of SARS-CoV (ACE2; Wang et al., 2013). Interestingly, the
presence of the DPP4 receptor in a host cell does not guarantee
binding and infection with MERS-CoV. For instance,
MERS-CoV can replicate in cells of humans, pigs, rabbits and
non-human primates (Chan et al., 2013). In contrast, despite
the expression of DPP4, replication of MERS-CoV was not
possible in hamsters (de Wit et al., 2013) or ferrets (Raj et
al., 2013). Furthermore, transfection of ferret kidney cells
with functional human DPP4 receptors rendered the cells
susceptible to infection with MERS-CoV (Raj et al., 2013).

In this study, bioinformatics approaches were combined with
the previously determined critical factors for binding and infection
with MERS-CoV to predict the host tropism
of the newly emerged virus. Here, we identify a group
of amino acids at the virus-host receptor interface as a marker for
efficient binding and subsequent infection with MERS-CoV.
The resulting predictive tool can be used to anticipate the host
tropism of MERS-CoV in the environment surrounding
an infected region.


Materials and Methods 


The full-length DPP4 sequences were retrieved from the
nucleotide database of the National Center for Biotechnology
Information (NCBI). The potential protein domain contents of
the retrieved sequences were analyzed with the domain search tools
at NCBI. A BLAST search was used to identify high-homology
hits. The retrieved sequences of various species were exported
and analyzed for differences in amino acid composition. A multiple
sequence alignment was constructed using the Clustal Omega tool
at the European Bioinformatics Institute (EBI). The output file was
retrieved and manually edited in GeneDoc. Similarity
and homology percentages were calculated with UGENE 1.12.2 for
Mac. The output alignments from Clustal Omega were examined
in Dendroscope for the creation of phylogenetic trees.
The phylogenetic tree was created by the neighbor-joining method and
output as a radial phylogram or circular cladogram.
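The percentage calculation described above can be illustrated with a minimal Python sketch. It is not the UGENE implementation, whose exact treatment of gaps may differ. It assumes that the sequences have already been aligned (for example, exported from Clustal Omega) and that identity is counted only over columns in which neither sequence has a gap. The function names `percent_identity` and `identity_to_reference` are hypothetical.

```python
from typing import Dict


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two rows of the same alignment.

    Assumption: columns with a gap in either sequence are skipped, so the
    denominator is the number of gap-free columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def identity_to_reference(alignment: Dict[str, str], reference: str) -> Dict[str, float]:
    """Percent identity of every aligned sequence to one reference row."""
    ref_seq = alignment[reference]
    return {name: percent_identity(ref_seq, seq) for name, seq in alignment.items()}
```

With the human sequence as `reference`, the returned dictionary would give one identity value per species, comparable in spirit to the identity column of a BLAST report.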


Results and Discussion 


Protein bioinformatics is a useful strategy for
determining biomolecular interactions (Kandeel, 2014;
Kandeel and Kitade, 2013a,b), host restrictions of virus infection
(Tonnessen et al., 2013) and viral evolution (Cui et al.,
2013; Liu et al., 2012). In this report, bioinformatic tools
were adopted to predict MERS-CoV host tropism. The
prediction was based on two criteria. The first is the alignment of
the protein sequences of lancets 4 and 5 of DPP4 together with their
phylogenetic relations to human DPP4 (Fig. 1a and b).
The second is the alignment of the amino acid residues at the
interface of interaction (Fig. 1c) between lancets 4 and 5
of DPP4 and the spike protein of MERS-CoV.

The structure of DPP4 shows an N-terminal β-propeller
domain composed of 8 lancets and a C-terminal hydrolase domain.
Lancets 4 and 5 were found to be the binding site for the
S protein of MERS-CoV. Replacement of lancets 4 and
5, or mutations within them, had a drastic effect on binding
and infection with MERS-CoV.

Phylogenetic analysis of lancets 4 and 5 of DPP4 showed
that non-human primates (98% homology or more,
Table 1) and rabbits are closely related to human DPP4,
followed by bats and rodents such as guinea pig, hamster and
rat. The most divergent DPP4 sequences were those of birds and alligator
(homology below 60%, Table 1).

The binding interface between the S protein and DPP4 is
composed of polar contacts from the hydrophilic residues
K267, Q286, T288, R317, R336 and Q344 surrounding
a hydrophobic center formed by A291, L294 and I295
(Fig. 1c). Disruption of these residue interactions
resulted in a profound decrease in virus entry (Wang et al.,
2013). In this report, we assume that MERS-CoV binding
and replication in a specific host depend on the status of
these nine residues. The high replication of
MERS-CoV in non-human primates coincides with conservation
of all nine residues. Rabbits and,
to a lesser extent, pigs showed a highly conserved residue
pattern, consistent with the finding that infection with MERS-CoV
was possible in cells of these animals. Camelids showed a high
conservation profile, implicating camels as
a potential host for the virus. In this context, neutralizing antibodies
against MERS-CoV were detected in camels from the Middle
East (Perera et al., 2013; Reusken et al., 2013). A similarly high
conservation pattern was evident in the sequences from farm
animals such as sheep, goats and cattle. Although bat DPP4 was highly
divergent from human DPP4, it showed few changes in
the 9 residues described. This supports a possible role of
bats in the transfer of MERS-CoV. However, the estimated
MERS-CoV infection rate in bats was at least 3-fold lower than
that of SARS-CoV (Memish et al., 2013). Cats and rodents
showed amino acid replacements in at least half of the
nine residues. This may explain the low viral load
in their cell cultures compared with primates (Chan et al.,
2013). Compared to human DPP4, birds showed the
highest divergence (Fig. 1b). Furthermore, birds showed
the greatest changes in the marker residues
(Fig. 1a). In agreement with our assumption, chicken-derived
cell cultures did not support the replication of MERS-CoV
(Chan et al., 2013).
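The comparison of the nine marker residues can be sketched in Python as follows. This is an illustrative sketch only, not the procedure used to produce Fig. 1a. It assumes that each species' residues have already been mapped to human DPP4 numbering through the alignment. The names `HUMAN_INTERFACE` and `interface_conservation` are hypothetical, and no threshold for susceptibility is implied.

```python
from typing import Dict, List, Tuple

# Human DPP4 residues at the spike-binding interface, as listed in the text
# (position in human numbering -> one-letter amino acid code).
HUMAN_INTERFACE: Dict[int, str] = {
    267: "K", 286: "Q", 288: "T", 291: "A", 294: "L",
    295: "I", 317: "R", 336: "R", 344: "Q",
}


def interface_conservation(species_residues: Dict[int, str]) -> Tuple[int, List[str]]:
    """Count conserved marker residues and list the substitutions.

    species_residues maps human DPP4 positions to the residue found at the
    aligned position in another species. A missing position is reported
    with '?' and counted as not conserved.
    """
    conserved = 0
    changes: List[str] = []
    for pos in sorted(HUMAN_INTERFACE):
        human = HUMAN_INTERFACE[pos]
        other = species_residues.get(pos)
        if other == human:
            conserved += 1
        else:
            changes.append(f"{human}{pos}{other if other is not None else '?'}")
    return conserved, changes
```

For example, a purely hypothetical sequence carrying alanine at position 294 and threonine at position 336, and otherwise identical to the human residues, would be scored as 7 of 9 conserved with the substitutions `L294A` and `R336T`.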

In brief, resistance to MERS-CoV replication was associated
with significant changes in the amino acid residues
at the interface of interaction between the S protein and lancets
4 and 5 of DPP4. Experimental studies are needed to
confirm our predictive model. The degree of conservation
of these nine residues can be used to
predict the host tropism of MERS-CoV. These predictions
might be of value in prevention and control programs, in
which the sensitivity and resistance to MERS-CoV infection
in the surrounding environment can be anticipated.
Fig. 1. (a) Amino acid sequence alignment of lancets 4 and 5 of the β-propeller domain of DPP4 in different species, (b) phylogenetic tree of various animal species based on the amino acid sequence of DPP4, (c) structure of MERS-CoV complexed with human DPP4.
The receptor binding domain of the MERS-CoV spike (green), the extracellular part of DPP4 (turquoise) and the interaction interface (white) are shown.
Table 1. Characteristics of amino acid sequences of DPP4 in different species

Scientific name            Common name             Total score  Query cover  E value    Identity %  Acc. No.
Homo sapiens               Humans                  197          100%         4.00E-57   100%        CAA43118.1
Nomascus leucogenys        White-cheeked gibbon    195          100%         2.00E-56   99%         XP_003266219.1
Gorilla gorilla            Gorilla                 193          100%         8.00E-56   98%         XP_004032754.1
Macaca mulatta             Rhesus monkey           193          100%         9.00E-56   98%         NP_001034279.1
Macaca fascicularis        Long-tailed macaque     193          100%         9.00E-56   98%         XP_005573375.1
Papio anubis               Baboon                  193          100%         1.00E-55   98%         XP_003907588.1
Pan troglodytes            Chimpanzee              193          100%         1.00E-55   99%         XP_515858.2
Oryctolagus cuniculus      Rabbit                  188          100%         5.00E-54   94%         XP_002712206.1
Equus caballus             Horse                   179          100%         7.00E-51   88%         XP_005601601.1
Ceratotherium simum        Rhinoceros              175          100%         2.00E-49   87%         XP_004428321.1
Loxodonta africana         Elephant                174          100%         2.00E-48   85%         XP_003406047.1
Trichechus manatus         Sea cow                 164          100%         2.00E-45   81%         XP_004375482.1
Bos taurus                 Cow                     155          100%         6.00E-44   77%         DAA32742.1
Cavia porcellus            Guinea pig              160          100%         1.00E-43   80%         XP_003478612.2
Capra hircus               Goat                    155          100%         1.00E-42   77%         XP_005676104.1
Cricetulus griseus         Hamster                 154          100%         1.00E-42   73%         EGW01899.1
Myotis lucifugus           Brown bat               156          98%          2.00E-42   78%         XP_006083275.1
Ovis aries                 Sheep                   156          100%         2.00E-42   77%         XP_004004709.1
Dasypus novemcinctus       Armadillo               154          100%         6.00E-42   76%         XP_004464464.1
Camelus ferus              Camel                   153          100%         2.00E-41   75%         XP_006176870.1
Myotis brandtii            Vesper bat              153          98%          3.00E-41   77%         EPQ03437.1
Pipistrellus pipistrellus  Common pipistrelle bat  151          98%          2.00E-40   75%         AGF80256.1
Sus scrofa                 Pig                     148          98%          2.00E-39   74%         NP_999422.1
Orcinus orca               Killer whale            147          100%         3.00E-39   73%         XP_004283669.1
Felis catus                Cat                     146          100%         8.00E-39   71%         NP_001009838.1
Ailuropoda melanoleuca     Panda                   144          100%         9.00E-38   70%         XP_002924912.1
Mustela putorius furo      Ferret                  130          100%         5.00E-33   63%         ABC72084.1
Columba livia              Pigeon                  107          98%          1.00E-24   55%         XP_005498754.1
Falco cherrug              Falcon                  105          98%          4.00E-24   51%         XP_005443040.1
Ovophis okinavensis        Pit viper               104          100%         6.00E-24   54%         BAN82157.1
Alligator sinensis         Alligator               100          98%          1.00E-22   52%         XP_006037514.1
Pseudopodoces humilis      Ground tit              100          98%          1.00E-22   51%         XP_005520053.1
Taeniopygia guttata        Zebra finch             99           98%          5.00E-22   50%         XP_004176799.1
Gallus gallus              Fowl                    94           82%          4.00E-20   56%         NP_001026426.1